Terms
Data Visualization Types
Term | Description |
---|---|
Area chart | A data visualization that uses individual data points for a changing variable connected by a continuous line with a filled-in area underneath. |
Box plot | A data visualization that displays the distribution of values along an axis, summarized by the median, quartiles, and outliers. |
Bubble chart | A data visualization that displays individual data points as bubbles, comparing numeric values by their relative size. |
Bullet graph | A data visualization that displays data as a horizontal bar chart moving toward a desired value. |
Circle view | A data visualization that shows comparative strength in data. |
Column chart | A data visualization that uses individual data points for a changing variable, represented as vertical columns. |
Combo chart | A data visualization that combines more than one visualization type. |
Density map | A data visualization that represents concentrations, with color representing the number or frequency of data points in a given area on a map. |
Distribution graph | A data visualization that displays the frequency of various outcomes in a sample. |
Donut chart | A data visualization where segments of a ring represent data values adding up to a whole. |
Filled map | A data visualization that colors areas in a map based on measurements or dimensions. |
Gantt chart | A data visualization that displays the duration of events or activities on a timeline. |
Gauge chart | A data visualization that shows a single result within a progressive range of values. |
Heat map | A data visualization that uses color contrast to compare categories in a dataset. |
Highlight table | A data visualization that uses conditional formatting and color on a table. |
Histogram | A data visualization that shows how often data values fall into certain ranges. |
Line graph | A data visualization that uses one or more lines to display shifts or changes in data over time. |
Packed bubble chart | A data visualization that displays data in clustered circles. |
Pie chart | A data visualization that uses segments of a circle to represent the proportions of each data category compared to the whole. |
Pivot chart | A chart created from the fields in a pivot table. |
Scatterplot | A data visualization that represents relationships between different variables with individual data points without a connecting line. |
Symbol map | A data visualization that displays a mark over a given longitude and latitude. |
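
As an illustration of the histogram entry above, the following minimal sketch counts how often values fall into fixed-width ranges and prints a text-only chart. The sample values and the bin width of 10 are made up for this example, and a charting tool would normally draw the bars.

```python
from collections import Counter

# Hypothetical sample values, invented for this illustration.
values = [3, 7, 12, 18, 21, 25, 25, 34]
bin_width = 10  # assumed range size for each bar

# Map each value to the lower bound of its range, then count per range.
bins = Counter((value // bin_width) * bin_width for value in values)

# Print one row per range; the number of '#' marks is the frequency.
for start in sorted(bins):
    end = start + bin_width - 1
    print(f"{start:>2}-{end:<2} | {'#' * bins[start]}")
```

Each row plays the role of one histogram bar: the label is the range and the bar length is how many values fell inside it.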
Database and SQL Concepts
Term | Description |
---|---|
CASE | A SQL expression that returns different values depending on which conditions a record meets, adding if/then logic to a query. |
CAST | A SQL function that converts data from one data type to another. |
COUNT | A spreadsheet function that counts the number of cells in a range that contain numeric values. |
COUNTA | A spreadsheet function that counts the number of cells in a range that are not empty. |
COUNTIF | A spreadsheet function that returns the number of cells in a range that match a specified value or condition. |
COUNT DISTINCT | A SQL function that returns the number of distinct values in a specified column. |
CREATE TABLE | A SQL statement that adds a new table to a database, where it can be used by multiple people. |
DROP TABLE | A SQL statement that removes a table and its data from a database. |
GROUP BY | A SQL clause that groups rows that have the same values from a table into summary rows. |
INNER JOIN | A SQL clause that returns only the records with matching values in both tables. |
JOIN | A SQL clause that combines rows from two or more tables based on a related column. |
LEFT JOIN | A SQL clause that returns all records from the left table and only the matching records from the right table. |
RIGHT JOIN | A SQL clause that returns all records from the right table and only the matching records from the left table. |
OUTER JOIN | A SQL clause that combines RIGHT and LEFT JOIN to return all records from both tables, matched where possible and filled with nulls where there is no match. |
SELECT | The section of a query that indicates from which column(s) to extract the data. |
SELECT INTO | A SQL clause that copies data from one table into a new table, commonly a temporary one. |
WHERE | The section of a query that specifies criteria that the requested data must meet. |
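
To make several of the entries above concrete, here is a minimal sketch that runs them against an in-memory SQLite database through Python's standard `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical, invented only for this illustration, and other database systems may differ in syntax details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE: two hypothetical tables, invented for this example.
cur.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
cur.execute("CREATE TABLE orders (customer_id INTEGER, amount TEXT)")
cur.executemany("INSERT INTO customers VALUES (?, ?)",
                [(1, "Ana"), (2, "Ben"), (3, "Cy")])
cur.executemany("INSERT INTO orders VALUES (?, ?)",
                [(1, "10.5"), (1, "4.5"), (2, "20")])

# INNER JOIN + WHERE + CAST: matching rows only, filtered on a
# text amount converted to a number.
inner_rows = cur.execute(
    "SELECT c.name, o.amount "
    "FROM customers c INNER JOIN orders o ON o.customer_id = c.id "
    "WHERE CAST(o.amount AS REAL) > 5 "
    "ORDER BY c.name"
).fetchall()

# LEFT JOIN + GROUP BY + CASE: every customer appears once,
# including Cy, who has no orders.
summary = cur.execute(
    "SELECT c.name, "
    "       COUNT(o.customer_id) AS n_orders, "
    "       SUM(CAST(o.amount AS REAL)) AS total, "
    "       CASE WHEN COUNT(o.customer_id) = 0 "
    "            THEN 'inactive' ELSE 'active' END AS status "
    "FROM customers c LEFT JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.name "
    "ORDER BY c.name"
).fetchall()

# COUNT DISTINCT: number of different customers who placed an order.
distinct_buyers = cur.execute(
    "SELECT COUNT(DISTINCT customer_id) FROM orders"
).fetchone()[0]

# DROP TABLE: remove a table once it is no longer needed.
cur.execute("DROP TABLE orders")
conn.close()

print(inner_rows)
print(summary)
print(distinct_buyers)
```

The LEFT JOIN keeps Cy in the result with zero orders and a null total, whereas the INNER JOIN drops that customer entirely.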
Statistical and Mathematical Concepts
Term | Description |
---|---|
AVERAGE | A spreadsheet function that returns an average of the values from a selected range. |
AVERAGEIF | A spreadsheet function that returns the average of all cell values from a given range that meet a specified condition. |
Confidence interval | A range of values, calculated from a sample, that is likely to contain the true population value at a given confidence level. |
Confidence level | How confident one can be that the results from a sample reflect the greater population, expressed as a percentage such as 95%. |
Correlation | The measure of the degree to which two variables change in relationship to each other. |
COUNT | A spreadsheet function that counts the number of cells in a range that contain numeric values. |
COUNTA | A spreadsheet function that counts the number of cells in a range that are not empty. |
COUNTIF | A spreadsheet function that returns the number of cells in a range that match a specified value or condition. |
COUNT DISTINCT | A SQL function that returns the number of distinct values in a specified column. |
MAX | A spreadsheet function that returns the largest numeric value from a range of cells. |
MIN | A spreadsheet function that returns the smallest numeric value from a range of cells. |
SUM | A spreadsheet function that adds the values of a selected range of cells. |
SUMIF | A spreadsheet function that adds the values in a range that meet a single condition. |
SUMPRODUCT | A spreadsheet function that multiplies corresponding values in two or more arrays and returns the sum of those products. |
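
The spreadsheet functions above can be mirrored in a few lines of Python, which may help show how they differ. This is only a sketch: the `cells` list stands in for a hypothetical spreadsheet column, with `None` playing the role of an empty cell, and the condition "greater than 4" is an arbitrary choice for the example.

```python
from statistics import mean

# Hypothetical column: two text/empty entries among the numbers.
cells = [4, "n/a", None, 7, 4, 10]
numbers = [c for c in cells if isinstance(c, (int, float))]

count = len(numbers)                        # COUNT: numeric cells only
counta = sum(c is not None for c in cells)  # COUNTA: non-empty cells
countif = sum(c == 4 for c in cells)        # COUNTIF(range, 4)
count_distinct = len(set(numbers))          # distinct numeric values

average = mean(numbers)                        # AVERAGE
averageif = mean(n for n in numbers if n > 4)  # AVERAGEIF(range, ">4")
largest, smallest = max(numbers), min(numbers) # MAX and MIN
total = sum(numbers)                           # SUM
sumif = sum(n for n in numbers if n > 4)       # SUMIF(range, ">4")

# SUMPRODUCT: multiply paired values (quantity * price), then add.
quantities = [2, 3]
prices = [5.0, 1.5]
sumproduct = sum(q * p for q, p in zip(quantities, prices))

print(count, counta, countif, count_distinct)
print(average, averageif, largest, smallest, total, sumif, sumproduct)
```

Note how COUNT ignores the text entry that COUNTA still counts, which is the practical difference between the two.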
Data Management and Analysis Terms
Term | Description |
---|---|
Data aggregation | The process of gathering data from multiple sources and combining it into a single, summarized collection. |
Data analysis | The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making. |
Data analyst | Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making. |
Data analytics | The science of data. |
Data anonymization | The process of protecting people's private or sensitive data by eliminating identifying information. |
Data bias | When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction. |
Data design | How information is organized. |
Data-driven decision-making | Using facts to guide business strategy. |
Data ecosystem | The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data. |
Data governance | A process for ensuring the formal management of a company’s data assets. |
Data integrity | The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle. |
Data interoperability | The ability to integrate data from multiple sources, which is a key factor in the successful use of open data among companies and governments. |
Data life cycle | The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy. |
Data mapping | The process of matching fields from one data source to another. |
Data merging | The process of combining two or more datasets into a single dataset. |
Data model | A tool for organizing data elements and how they relate to one another. |
Data privacy | Preserving a data subject’s information any time a data transaction occurs. |
Data replication | The process of storing data in multiple locations. |
Data security | Protecting data from unauthorized access or corruption by adopting safety measures. |
Data strategy | The management of the people, processes, and tools used in data analysis. |
Data transfer | The process of copying data from a storage device to computer memory or from one computer to another. |
Data validation | A tool for checking the accuracy and quality of data. |
Data visualization | The graphical representation of data. |
Database | A collection of data stored in a computer system. |
Dataset | A collection of data that can be manipulated or analyzed as one unit. |
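
Data mapping, data merging, and data validation often appear together in practice. The following minimal sketch shows one way the three steps could fit, assuming two hypothetical sources (`crm_rows` and `billing_rows`) that name the same fields differently. The field names and the validation rule are invented for this illustration.

```python
# Hypothetical records from two sources with different field names.
crm_rows = [
    {"cust_id": 1, "fname": "Ana"},
    {"cust_id": None, "fname": "Cy"},  # missing ID, should fail validation
]
billing_rows = [{"customer_id": 2, "first_name": "Ben"}]

# Data mapping: match each source field to its target field.
field_map = {"cust_id": "customer_id", "fname": "first_name"}
mapped_crm = [
    {field_map.get(key, key): value for key, value in row.items()}
    for row in crm_rows
]

# Data merging: combine both datasets into a single dataset.
merged = mapped_crm + billing_rows


# Data validation: an assumed rule that every row needs an integer ID
# and a non-empty first name.
def is_valid(row):
    has_id = isinstance(row.get("customer_id"), int)
    has_name = bool(row.get("first_name"))
    return has_id and has_name


valid_rows = [row for row in merged if is_valid(row)]
rejected_rows = [row for row in merged if not is_valid(row)]

print(valid_rows)
print(rejected_rows)
```

Keeping rejected rows in their own list, rather than discarding them, makes it easier to review why a record failed the check.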